
Add Vertex AI provider support for Claude models #152

Open
fcorrea wants to merge 1 commit into 1jehuang:master from fcorrea:feat/vertex-ai-provider


@fcorrea fcorrea commented May 6, 2026

Add Vertex AI provider support for Claude models

Summary

  • Add support for routing Claude requests through Google Cloud Vertex AI when ANTHROPIC_VERTEX_PROJECT_ID and CLOUD_ML_REGION (or ANTHROPIC_VERTEX_REGION) are set

  • Introduce AuthMode enum (ApiKey | OAuth | Vertex { url }) in anthropic.rs to cleanly separate the three authentication paths without branching on booleans

  • Implement Google Application Default Credentials (ADC) token fetching with in-memory caching, supporting:

    • GCE/Cloud Run metadata server (automatic on GCP)
    • authorized_user credentials (from gcloud auth application-default login)
    • service_account credentials (from a downloaded service account JSON key)
  • Service account JWT signing uses ring (already a transitive dependency via rustls): pure Rust, no subprocess or openssl dependency required

  • Build the correct Vertex AI streaming endpoint: {region}-aiplatform.googleapis.com/v1/projects/{project}/locations/{region}/publishers/anthropic/models/{model}:streamRawPredict

  • Handle Vertex-specific API differences:

    • model is stripped from the request body (it's in the URL)
    • anthropic_version: "vertex-2023-10-16" is sent in the body instead of as a header
    • No anthropic-beta header is sent (Vertex rejects unknown beta headers; prompt caching works natively via cache_control blocks in the request body)
  • Add build_anthropic_vertex_route() so the model picker shows Vertex AI as a distinct provider

  • Initialize AnthropicProvider when Vertex env vars are present, even without Claude OAuth credentials (startup.rs)

  • Update the model picker UI to show "Vertex AI" as the provider when Vertex env vars are set (inline_interactive.rs)
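
A minimal sketch of the endpoint construction described above, not the code from this PR. The AuthMode variants follow the summary; vertex_stream_url and vertex_mode are hypothetical names, and the sketch assumes the ApiKey and OAuth variants carry no data. How the multi-region identifiers us and eu map to hosts is not stated here, so they are not special-cased.

```rust
/// The three authentication paths named in the summary.
/// Assumption: only the Vertex variant carries data (the resolved URL).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum AuthMode {
    ApiKey,
    OAuth,
    Vertex { url: String },
}

/// Build the streaming endpoint for a project, region, and model.
/// `vertex_stream_url` is a hypothetical name, not taken from the PR.
#[allow(dead_code)]
fn vertex_stream_url(project: &str, region: &str, model: &str) -> String {
    // `global` has no regional subdomain; any other value is used as a prefix.
    let host = if region == "global" {
        "aiplatform.googleapis.com".to_string()
    } else {
        format!("{region}-aiplatform.googleapis.com")
    };
    // The model lives in the URL path, which is why it is stripped from the body.
    format!(
        "https://{host}/v1/projects/{project}/locations/{region}/publishers/anthropic/models/{model}:streamRawPredict"
    )
}

/// Wrap the resolved URL in the Vertex auth mode (hypothetical helper).
#[allow(dead_code)]
fn vertex_mode(project: &str, region: &str, model: &str) -> AuthMode {
    AuthMode::Vertex {
        url: vertex_stream_url(project, region, model),
    }
}
```

Carrying the URL inside the Vertex variant means request code can match on the mode once and never needs a separate boolean to decide which endpoint or headers apply.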

Edge cases and tradeoffs

  • Token refresh: Google ADC tokens are cached in-memory with a 60-second safety margin before expiry. On GCE this hits the metadata server; elsewhere it refreshes via the OAuth2 token endpoint or a signed JWT for service accounts.

  • global region: When CLOUD_ML_REGION=global, the endpoint uses https://aiplatform.googleapis.com (no subdomain), matching Anthropic's SDK behavior. Multi-region identifiers us and eu are also accepted by Vertex.

  • Prompt caching: Fully supported on Vertex — cache_control blocks flow through in the request body unchanged. The anthropic-beta: prompt-caching-2024-07-31 header is intentionally omitted because Vertex rejects unrecognized beta headers.

  • Service account key format: ring accepts PKCS#8 DER (the format Google generates for service account JSON keys) with a PKCS#1 DER fallback. The PEM headers are stripped and the body base64-decoded before passing to ring.

  • Code size: anthropic.rs grew significantly. The ADC/JWT machinery could be extracted to a separate module in a follow-up.
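
The token refresh rule above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: CachedToken, TokenCache, and get_or_refresh are hypothetical names, and the actual fetch (metadata server, OAuth2 token endpoint, or signed JWT) is abstracted into a closure that returns a token and its lifetime.

```rust
use std::time::{Duration, Instant};

/// Refresh this long before the reported expiry (the 60-second margin).
const SAFETY_MARGIN: Duration = Duration::from_secs(60);

/// A cached access token. Type and field names are hypothetical.
#[derive(Debug, Clone)]
struct CachedToken {
    access_token: String,
    expires_at: Instant,
}

impl CachedToken {
    fn new(access_token: String, fetched_at: Instant, expires_in: Duration) -> Self {
        Self {
            access_token,
            expires_at: fetched_at + expires_in,
        }
    }

    /// True while more than the safety margin remains before expiry.
    fn is_fresh(&self, now: Instant) -> bool {
        match self.expires_at.checked_duration_since(now) {
            Some(remaining) => remaining > SAFETY_MARGIN,
            None => false, // already past expiry
        }
    }
}

/// In-memory cache: reuse a fresh token, otherwise call `fetch`.
#[derive(Default)]
struct TokenCache {
    slot: Option<CachedToken>,
}

impl TokenCache {
    /// `fetch` stands in for whichever ADC source applies; it returns
    /// the token and its lifetime, or an error message.
    fn get_or_refresh<F>(&mut self, now: Instant, fetch: F) -> Result<String, String>
    where
        F: FnOnce() -> Result<(String, Duration), String>,
    {
        if let Some(tok) = &self.slot {
            if tok.is_fresh(now) {
                return Ok(tok.access_token.clone());
            }
        }
        let (token, expires_in) = fetch()?;
        self.slot = Some(CachedToken::new(token.clone(), now, expires_in));
        Ok(token)
    }
}
```

Passing `now` in explicitly keeps the expiry logic testable without sleeping; a caller would normally pass `Instant::now()`.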

Validation

  • Tested locally with ANTHROPIC_VERTEX_PROJECT_ID, CLOUD_ML_REGION, and authorized_user ADC credentials. Full multi-turn conversations with tool use and prompt caching worked correctly through Vertex — this PR was written while jcode was actively running through Vertex AI.

  • cargo fmt --all -- --check: passes

  • cargo check -p jcode: passes

  • cargo clippy on changed files: no new warnings

  • Panic budget: no new unwrap/expect in our changes

  • Pre-existing budget failures (jcode-import-core, jcode-tui-mermaid, unrelated files) are present on master before this change

Environment variables

| Variable | Required | Notes |
| --- | --- | --- |
| ANTHROPIC_VERTEX_PROJECT_ID | Yes | GCP project ID |
| CLOUD_ML_REGION | Yes (or ANTHROPIC_VERTEX_REGION) | e.g. us-east5, global, us, eu |
| GOOGLE_APPLICATION_CREDENTIALS | No | Path to service account JSON; falls back to ~/.config/gcloud/application_default_credentials.json |
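
A rough sketch of how these variables could be resolved, under stated assumptions rather than as the PR's actual logic. VertexConfig, vertex_config_from, and adc_path_from are hypothetical names. The sketch assumes CLOUD_ML_REGION takes precedence when both region variables are set, that empty values count as unset, and that the home directory comes from HOME.

```rust
use std::path::PathBuf;

/// Resolved Vertex settings. Struct and function names are hypothetical.
#[derive(Debug, PartialEq)]
struct VertexConfig {
    project_id: String,
    region: String,
}

/// Resolve settings through a lookup function so the logic can be
/// exercised without touching the process environment.
#[allow(dead_code)]
fn vertex_config_from<F>(lookup: F) -> Option<VertexConfig>
where
    F: Fn(&str) -> Option<String>,
{
    // Assumption: an empty or whitespace-only value is treated as unset.
    let non_empty = |name: &str| lookup(name).filter(|v| !v.trim().is_empty());
    let project_id = non_empty("ANTHROPIC_VERTEX_PROJECT_ID")?;
    // Assumption: CLOUD_ML_REGION wins when both region variables are set.
    let region =
        non_empty("CLOUD_ML_REGION").or_else(|| non_empty("ANTHROPIC_VERTEX_REGION"))?;
    Some(VertexConfig { project_id, region })
}

/// Pick the credentials file: the explicit variable first, then the
/// gcloud default location under the home directory.
#[allow(dead_code)]
fn adc_path_from<F>(lookup: F) -> Option<PathBuf>
where
    F: Fn(&str) -> Option<String>,
{
    if let Some(path) = lookup("GOOGLE_APPLICATION_CREDENTIALS") {
        return Some(PathBuf::from(path));
    }
    // Assumption: HOME identifies the home directory (Unix-style).
    let home = lookup("HOME")?;
    Some(PathBuf::from(home).join(".config/gcloud/application_default_credentials.json"))
}

/// Read the settings from the real process environment.
#[allow(dead_code)]
fn vertex_config_from_env() -> Option<VertexConfig> {
    vertex_config_from(|name| std::env::var(name).ok())
}
```

Returning None when either the project or the region is missing matches the table: both are required before the Vertex route is considered configured.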


Route Claude requests through Google Cloud Vertex AI when
ANTHROPIC_VERTEX_PROJECT_ID and CLOUD_ML_REGION (or
ANTHROPIC_VERTEX_REGION) are set.

Authentication via Google Application Default Credentials:
- GCE/Cloud Run metadata server (automatic on GCP)
- authorized_user credentials (gcloud auth application-default login)
- service_account credentials (JSON key file)

Service account JWT signing uses ring (already a transitive dep via
rustls) - pure Rust, no subprocess or openssl dependency required.

Vertex-specific API differences handled:
- model stripped from request body (encoded in the URL instead)
- anthropic_version sent in body as 'vertex-2023-10-16' (not a header)
- anthropic-beta header omitted; prompt caching works natively via
  cache_control blocks in the request body